
    A Discrete Logarithm-based Approach to Compute Low-Weight Multiples of Binary Polynomials

    Efficiently computing a low-weight multiple of a given binary polynomial is often a key ingredient of correlation attacks on LFSR-based stream ciphers. The best known general-purpose algorithm is based on the generalized birthday problem. We describe an alternative approach that is based on discrete logarithms and has much lower memory requirements with a comparable time complexity. Comment: 12 pages.
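    As a toy illustration of why discrete logarithms are relevant here: a weight-3 multiple x^a + x^b + 1 of a polynomial P exists exactly when x^b ≡ x^a + 1 (mod P), so b is the discrete logarithm of x^a + 1 to the base x. The sketch below is not the algorithm of the paper; it replaces a real discrete-log computation with a lookup table of powers of x, which is only feasible for small degrees. The function names and the bit-vector encoding (bit i holds the coefficient of x^i) are assumptions made for this example.

```python
def poly_mod(a: int, p: int) -> int:
    """Remainder of a divided by p over GF(2); bit i is the coefficient of x^i."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        # Cancel the leading term of a with a shifted copy of p.
        a ^= p << (a.bit_length() - 1 - deg_p)
    return a


def trinomial_multiple(p: int, max_deg: int):
    """Return (a, b) with max_deg >= a > b > 0 such that p divides
    x^a + x^b + 1, or None if no such pair exists up to max_deg."""
    seen = {}  # residue of x^b mod p -> smallest exponent b (toy "log table")
    r = 1      # x^0 mod p
    for a in range(1, max_deg + 1):
        r = poly_mod(r << 1, p)   # r is now x^a mod p
        target = r ^ 1            # we need x^b == x^a + 1 (mod p)
        if target in seen:
            return a, seen[target]
        seen.setdefault(r, a)
    return None


if __name__ == "__main__":
    P = 0b111101  # x^5 + x^4 + x^3 + x^2 + 1, a weight-5 polynomial
    print(trinomial_multiple(P, 31))
```

    The table holds one entry per exponent tried, which is the memory cost that a genuine discrete-logarithm method would avoid.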

    RIME: Repeat Identification

    We present an algorithm for detecting long similar fragments occurring at least twice in a set of biological sequences. The problem becomes computationally challenging when the frequency of a repeat is allowed to increase and when a non-negligible number of insertions, deletions and substitutions are allowed. We introduce in this paper an algorithm, Rime (for Repeat Identification: long, Multiple, and with Edits), that performs this task and handles instances whose size and combination of parameters cannot be handled by other existing methods. (Rime is also a reference to Coleridge's poem "The Rime of the Ancient Mariner", which contains many repetitions as a poetic device.) This is achieved by using a filter as a preprocessing step, and by then exploiting the information gathered by the filter in the subsequent repeat inference step. To the best of our knowledge, Rime is the first algorithm that can accurately deal with very long repeats (up to a few thousand), possibly occurring several times, and with a rate of differences (substitutions and indels) among copies of the same repeat of 10-15% or even more.

    Longest property-preserved common factor

    In this paper we introduce a new family of string processing problems. We are given two or more strings and we are asked to compute a factor common to all strings that preserves a specific property and has maximal length. Here we consider two fundamental string properties, square-free factors and periodic factors, under two different settings, one per property. In the first setting, we are given a string x and we are asked to construct a data structure over x answering the following type of on-line queries: given a string y, find a longest square-free factor common to x and y. In the second setting, we are given k strings and an integer 1 < k’ ≤ k and we are asked to find a longest periodic factor common to at least k’ strings. We present linear-time solutions for both settings. We anticipate that our paradigm can be extended to other string properties.
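    To make the first setting concrete, the following brute-force sketch computes a longest square-free factor common to two strings x and y. It only illustrates the problem statement: it is far slower than linear time and is not the data structure described in the abstract. The function names are chosen for this example.

```python
def is_square_free(s: str) -> bool:
    """True if s contains no factor of the form uu with u non-empty."""
    n = len(s)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if s[i:i + half] == s[i + half:i + 2 * half]:
                return False
    return True


def longest_square_free_common_factor(x: str, y: str) -> str:
    """Brute-force search for a longest square-free factor shared by x and y."""
    best = ""
    for i in range(len(x)):
        # Only try factors starting at i that would beat the current best.
        for j in range(i + len(best) + 1, len(x) + 1):
            f = x[i:j]
            # Both properties are inherited by prefixes, so once one fails
            # no longer factor starting at i can succeed.
            if not is_square_free(f) or f not in y:
                break
            best = f
    return best


if __name__ == "__main__":
    print(longest_square_free_common_factor("abcabc", "xbcab"))  # bcab
```

    The early `break` relies on the fact that any factor of a square-free common factor is itself square-free and common, which keeps the search from revisiting hopeless extensions.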

    Two truncating variants in FANCC and breast cancer risk


    Polymorphism rs4919510:C>G in Mature Sequence of Human MicroRNA-608 Contributes to the Risk of HER2-Positive Breast Cancer but Not Other Subtypes

    BACKGROUND: A few polymorphisms are located in the mature microRNA sequences. Such polymorphisms could directly affect the binding of microRNA to hundreds of target mRNAs. It remains unknown whether rs4919510:C>G located in the mature miR-608 alters breast cancer susceptibility. METHODS: The association of rs4919510:C>G with risk and pathologic features of breast cancer was investigated in two independent case-control studies, the first set including 1,138 sporadic breast cancer patients (including 927 invasive ductal carcinoma patients, 777 of them with known subtypes: 496 luminal-like, 133 HER2-positive, and 148 triple-negative) and 1,434 community-based controls, and the second set including 294 familial/early-onset breast cancer patients and 500 hospital-based cancer-free controls. Odds ratios (ORs) were estimated by logistic regression. Predicted targets of miR-608 and complementary sequences containing rs4919510:C>G were surveyed to reveal a potential pathological mechanism. RESULTS: In the first set, although rs4919510:C>G was unrelated to breast cancer in general patients, variant genotypes (CG/GG) were specifically associated with increased risk of HER2-positive subtype (Adjusted OR = 1.97, 95% CI, 1.34-2.90 in the recessive model). Variant G-allele was the risk allele with OR of 1.62 (95% CI, 1.23-2.15). Patients carrying GG-genotype also had larger HER2-positive tumors (P for Kruskal-Wallis test = 0.006). The relationship between rs4919510:C>G and risk of HER2-positive subgroup was validated in the second set (Bonferroni corrected P = 0.06). The adjusted combined OR (total 164 HER2-positive cases) in the recessive model was 1.97 (95% CI, 1.43-2.72) for GG genotype (corrected P = 1.1 × 10^-4). Bioinformatic analysis indicated that HSF1, which is required for HER2-induced tumorigenesis, might be a target of miR-608. The minimum free-energy of ancestral-miR-608 (C-allele) binding to HSF1 is -35.9 kcal/mol, while that of variant-form (G-allele) is -31.5 kcal/mol, indicating a lower affinity of variant-miR-608 to HSF1 mRNA. CONCLUSION: rs4919510:C>G in mature miR-608 may influence HER2-positive breast cancer risk and tumor proliferation.

    Clinical and pathologic characteristics of BRCA-positive and BRCA-negative male breast cancer patients: results from a collaborative multicenter study in Italy

    Recently, the number of studies on male breast cancer (MBC) has been increasing. However, as MBC is a rare disease, it is difficult to undertake studies to identify specific MBC subgroups. At present, it is still largely unknown whether BRCA-related breast cancer (BC) in men may display specific characteristics, as is the case for BRCA-related BC in women. To investigate the clinical-pathologic features of MBC in association with BRCA mutations, we established a collaborative Italian Multicenter Study on MBC with the aim of recruiting a large series of MBCs. A total of 382 MBCs, including 50 BRCA carriers, were collected from ten Italian Investigation Centres covering the whole country. In MBC patients, BRCA2 mutations were associated with family history of breast/ovarian cancer (p < 0.0001), personal history of other cancers (p = 0.044) and contralateral BC (p = 0.001). BRCA2-associated MBCs presented with high tumor grade (p = 0.001), PR- (p = 0.026) and HER2+ (p = 0.001) status. In a multivariate logistic model, BRCA2 mutations showed positive association with personal history of other cancers (OR 11.42, 95% CI 1.79-73.08) and high tumor grade (OR 4.93, 95% CI 1.02-23.88) and inverse association with PR+ status (OR 0.19, 95% CI 0.04-0.92). Based on immunohistochemical (IHC) profile, four molecular subtypes of MBC were identified. Luminal A was the most common subtype (67.7%), luminal B was observed in 26.5% of the cases, and HER2 positive and triple negative were represented by 2.1% and 3.7% of tumors, respectively. Intriguingly, we found that both luminal B and HER2 positive subtypes were associated with high tumor grade (p = 0.003 and 0.006, respectively) and with BRCA2 mutations (p = 0.016 and 0.001, respectively). In conclusion, our findings indicate that BRCA2-related MBCs represent a subgroup of tumors with a peculiar phenotype characterized by aggressive behavior. The identification of a BRCA2-associated phenotype might define a subset of MBC patients eligible for personalized clinical management.

    Association of low-penetrance alleles with male breast cancer risk and clinicopathological characteristics: results from a multicenter study in Italy

    It is well known that male breast cancer (MBC) susceptibility is mainly due to high-penetrance BRCA1/2 mutations. Here, we investigated whether common low-penetrance breast cancer (BC) susceptibility alleles may influence MBC risk in the Italian population and whether variant alleles may be associated with specific clinicopathological features of MBCs. Within the framework of the Italian Multicenter Study on MBC, we genotyped 413 MBCs and 745 age-matched male controls at 9 SNPs annotating known BC susceptibility loci. By multivariate logistic regression models, we found a significantly increased MBC risk for 3 SNPs, in particular, with codominant models, for rs2046210/ESR1 (OR = 1.71; 95% CI: 1.43-2.05; p = 0.0001), rs3803662/TOX3 (OR = 1.59; 95% CI: 1.32-1.92; p = 0.0001), and rs2981582/FGFR2 (OR = 1.26; 95% CI: 1.05-1.50; p = 0.013). Furthermore, we showed that the prevalence of the risk genotypes of ESR1 tended to be higher in ER- tumors (p = 0.062). In a case-case multivariate analysis, a statistically significant association between ESR1 and ER- tumors was found (OR = 1.88; 95% CI: 1.03-3.49; p = 0.039). Overall, our data, based on a large and well-characterized MBC series, support the hypothesis that common low-penetrance BC susceptibility alleles play a role in MBC susceptibility and, interestingly, indicate that ESR1 is associated with a distinct tumor subtype defined by ER-negative status.

    Optimal neighborhood indexing for protein similarity search

    Background: Similarity inference, one of the main bioinformatics tasks, must cope with the exponential growth of biological data. A classical approach to this data flow involves heuristics with large seed indexes. To speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. Results: The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood leads to a 35% reduction in the memory involved in the process, without sacrificing either the quality of results or the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. Conclusions: We propose a practical index size reduction of the neighborhood data that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.
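    The effect of an alphabet reduction on index size can be sketched as follows. The grouping of the 20 amino acids into 8 classes below is purely illustrative and is not taken from the paper; the function names are likewise assumptions. The sketch only shows how the number of distinct keys of a given length shrinks under reduction, not the neighborhood indexing or the rectangular score matrices themselves.

```python
# Hypothetical grouping of the 20 amino acids into 8 classes (illustrative only).
GROUPS = ["AGST", "C", "DE", "FWY", "HKR", "ILMV", "NQ", "P"]

# Map every amino acid to the first letter of its group.
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_seq(seq: str) -> str:
    """Rewrite a protein sequence over the reduced alphabet."""
    return "".join(REDUCE[aa] for aa in seq)


def key_space(alphabet_size: int, length: int) -> int:
    """Number of distinct words of the given length over the alphabet."""
    return alphabet_size ** length


if __name__ == "__main__":
    print(reduce_seq("MKV"))           # IHI
    print(key_space(20, 4))            # 160000 keys over the full alphabet
    print(key_space(len(GROUPS), 4))   # 4096 keys over the reduced alphabet
```

    With fewer possible keys per position, longer words can be indexed in the same space, which is the trade-off the abstract describes between alphabet reduction and neighborhood length.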